Last Update: 2025/3/26

LLMProvider is dedicated to offering access to a wide range of text-based AI models, currently supporting over 100 model endpoints. If you're interested in models or providers that we don't yet support, feel free to share your suggestions with us on our Discord channel.

The prices listed below are for 1M tokens. A token is the smallest unit of text processed by the model, encompassing words, numbers, and punctuation. Billing is based on the total number of tokens, including both input and output, processed by the model.

Note: Prices are subject to change and are for reference only.
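To make the billing arithmetic concrete, here is a minimal sketch of a per-request cost estimate. The helper function and the token counts are illustrative only; the example prices are the openai/gpt-4o-mini rates from the OpenAI table below.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Estimate the USD cost of one request under per-1M-token pricing."""
    return (input_tokens / 1_000_000) * input_price_per_1m + \
           (output_tokens / 1_000_000) * output_price_per_1m

# Example (illustrative): openai/gpt-4o-mini at $0.15 / 1M input and $1.50 / 1M output.
# 12,000 input tokens and 800 output tokens:
#   12,000 / 1M * $0.15 + 800 / 1M * $1.50 = $0.0018 + $0.0012 = $0.0030
print(estimate_cost(12_000, 800, 0.15, 1.50))  # ≈ 0.003
```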

OpenAI

OpenAI provides a simple interface to state-of-the-art AI models for natural language processing, image generation, semantic search, and speech recognition. Follow this guide to learn how to generate human-like responses to natural language prompts, create vector embeddings for semantic search, and generate images from textual descriptions.
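If you call these models through an OpenAI-compatible endpoint, a request looks like the sketch below. This is a minimal example, assuming LLMProvider accepts the standard Chat Completions API and the prefixed model IDs from the table that follows; the base URL and API key are placeholders, not real values.

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLMPROVIDER_KEY",                      # placeholder key
    base_url="https://api.example-llmprovider.com/v1",   # placeholder base URL
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # model ID as listed in the table below
    messages=[{"role": "user", "content": "Explain what a token is in one sentence."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```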

OpenAI Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| openai/gpt-3.5-turbo | 16K | $0.50 | $1.50 | |
| openai/gpt-3.5-turbo-0125 | 16K | $0.50 | $1.50 | |
| openai/gpt-3.5-turbo-1106 | 16K | $0.50 | $1.50 | |
| openai/gpt-3.5-turbo-0613 | 16K | $0.50 | $1.50 | |
| openai/gpt-3.5-turbo-16k | 16K | $0.50 | $1.50 | |
| openai/gpt-4o | 128K | $2.50 | $15.00 | |
| openai/gpt-4o-2024-05-13 | 128K | $2.50 | $15.00 | |
| openai/gpt-4o-2024-08-06 | 128K | $2.50 | $15.00 | |
| openai/gpt-4o-mini | 128K | $0.15 | $1.50 | |
| openai/gpt-4o-mini-2024-07-18 | 128K | $0.15 | $1.50 | |
| openai/gpt-4o-mini-audio-preview | 128K | $0.15 | $1.50 | |
| openai/gpt-4o-mini-audio-preview-2024-12-17 | 128K | $0.15 | $1.50 | |
| openai/gpt-4o-mini-realtime-preview | 128K | $0.60 | $1.50 | |
| openai/gpt-4o-mini-realtime-preview-2024-12-17 | 128K | $0.60 | $1.50 | |
| openai/gpt-4-turbo | 128K | $10.00 | $30.00 | |
| openai/gpt-4-turbo-preview | 128K | $10.00 | $30.00 | |
| openai/gpt-4-1106-preview | 128K | $10.00 | $30.00 | |
| openai/gpt-4 | 8K | $30.00 | $60.00 | |
| openai/gpt-4-32k | 32K | $60.00 | $120.00 | |
| openai/gpt-4-vision-preview | 128K | $10.00 | $30.00 | |
| openai/gpt-4o-audio-preview | 128K | $5.00 | $15.00 | |
| openai/gpt-4o-audio-preview-2024-12-17 | 128K | $5.00 | $15.00 | |
| openai/gpt-4o-realtime-preview | 128K | $5.00 | $15.00 | |
| openai/gpt-4o-realtime-preview-2024-12-17 | 128K | $5.00 | $15.00 | |
| openai/o1-mini | 128K | $1.10 | $1.50 | |
| openai/o1-mini-2024-09-12 | 128K | $1.10 | $1.50 | |
| openai/o3-mini | 128K | $1.10 | $1.50 | New |
| openai/o3-mini-2025-01-31 | 128K | $1.10 | $1.50 | New |
| openai/o1 | 128K | $15.00 | $15.00 | |
| openai/o1-2024-12-17 | 128K | $15.00 | $15.00 | |
| openai/gpt-4.5-preview | 128K | $15.00 | $15.00 | |
| openai/gpt-4.5-preview-2025-02-27 | 128K | $15.00 | $15.00 | |

Notes:

  1. Max Output Tokens refers to the maximum number of tokens the model can generate in a single response.
  2. Input Cost and Output Cost are calculated per million tokens and may change based on OpenAI's pricing adjustments.
  3. GPT-4o series offers better performance at a lower price compared to GPT-4-Turbo, making it suitable for most advanced tasks.
  4. GPT-3.5-Turbo is the most cost-effective option, ideal for everyday tasks.

OpenAI Transcription and Speech Generation Pricing

| Model | Use Case | Cost | Notes |
|-------|----------|------|-------|
| openai/Whisper | Transcription | $0.006 / minute | |
| openai/TTS | Speech generation | $15.00 / 1M characters | |
| openai/TTS HD | Speech generation | $30.00 / 1M characters | |
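As a rough sketch of how the transcription pricing above applies, the call below uses the openai SDK's audio endpoint against the same placeholder OpenAI-compatible client as earlier; whether the gateway expects OpenAI's public name whisper-1 or the openai/Whisper ID listed above is an assumption to verify.

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LLMPROVIDER_KEY",
                base_url="https://api.example-llmprovider.com/v1")  # placeholders

with open("meeting.mp3", "rb") as audio_file:  # any local audio file
    transcript = client.audio.transcriptions.create(
        model="whisper-1",  # OpenAI's public name; listed above as openai/Whisper
        file=audio_file,
    )
print(transcript.text)  # billed per minute of audio
```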

OpenAI Image Generation Pricing

| Model | Quality | 256x256 | 512x512 | 1024x1024 | 1024x1792 | Notes |
|-------|---------|---------|---------|-----------|-----------|-------|
| openai/DALL·E 2 | | $0.016 | $0.018 | $0.020 | | |
| openai/DALL·E 3 | Standard | | | $0.04 | $0.08 | |
| openai/DALL·E 3 | HD | | | $0.08 | $0.12 | |
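A minimal image-generation sketch, again assuming an OpenAI-compatible endpoint with placeholder credentials; size and quality select the per-image price in the table above.

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LLMPROVIDER_KEY",
                base_url="https://api.example-llmprovider.com/v1")  # placeholders

image = client.images.generate(
    model="dall-e-3",          # OpenAI's public name for DALL·E 3
    prompt="A watercolor illustration of a city skyline at dawn",
    size="1024x1024",          # priced per image by size
    quality="standard",        # "standard" or "hd"
    n=1,
)
print(image.data[0].url)
```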

OpenAI Embeddings Pricing

| Model | Price per 1M tokens | Notes |
|-------|---------------------|-------|
| openai/text-embedding-3-small | $0.02 | |
| openai/text-embedding-3-large | $0.13 | |
| openai/text-embedding-ada-002 | $0.10 | |
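The sketch below shows an embeddings request, assuming the same placeholder OpenAI-compatible client and that the gateway accepts the prefixed model ID from the table above.

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LLMPROVIDER_KEY",
                base_url="https://api.example-llmprovider.com/v1")  # placeholders

resp = client.embeddings.create(
    model="openai/text-embedding-3-small",  # $0.02 / 1M tokens per the table above
    input=["semantic search over product docs", "vector similarity"],
)
vectors = [item.embedding for item in resp.data]
print(len(vectors), len(vectors[0]))  # two vectors; 3-small returns 1536 dimensions
```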

Gemini

Gemini offers a range of advanced AI models for natural language processing, including text generation, summarization, and translation. The models are designed to generate human-like responses to text prompts, making them ideal for chatbots, content generation, and research assistance.

Gemini Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| gemini/gemini-1.5-pro | 128K | $1.25 | $5.00 | |
| gemini/gemini-1.5-pro-latest | 128K | $1.25 | $5.00 | |
| gemini/gemini-2.0-pro-exp | 128K | $1.25 | $5.00 | |
| gemini/gemini-1.5-flash | 128K | $0.075 | $0.30 | |
| gemini/gemini-1.5-flash-8b | 128K | $0.075 | $0.30 | |
| gemini/gemini-2.0-flash | 128K | $0.075 | $0.30 | |
| gemini/gemini-1.5-flash-latest | 128K | $0.075 | $0.30 | |
| gemini/gemini-2.0-flash-lite-preview | 128K | $0.075 | $0.30 | |
| gemini/gemini-2.0-flash-exp | 128K | $3.50 | $10.50 | |
| gemini/gemini-2.0-flash-thinking-exp | 128K | $3.50 | $10.50 | |
| gemini/gemini-exp-1206 | 128K | $3.50 | $10.50 | |

Anthropic

Anthropic provides a range of advanced AI models for natural language processing, including text generation, summarization, and translation. The models are designed to generate human-like responses to text prompts, making them ideal for chatbots, content generation, and research assistance.

Anthropic Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| anthropic/claude-instant-1.2 | 100K | $1.00 | $3.00 | |
| anthropic/claude-2 | 100K | $1.50 | $4.50 | |
| anthropic/claude-2.0 | 100K | $1.50 | $4.50 | |
| anthropic/claude-3-haiku-20240307 | 100K | $1.50 | $4.50 | |
| anthropic/claude-3-sonnet-20240229 | 100K | $2.00 | $6.00 | |
| anthropic/claude-3-opus-20240229 | 100K | $2.50 | $7.50 | |
| anthropic/claude-3-5-sonnet-20240620 | 100K | $3.00 | $9.00 | |
| anthropic/claude-3-5-sonnet-20241022 | 100K | $3.50 | $10.50 | |

Anthropic Claude on Amazon Bedrock

Anthropic Claude is now available on Amazon Bedrock, a fully managed foundation model (FM) service from AWS. This integration allows businesses to leverage Claude’s advanced natural language processing (NLP) capabilities for various AI-driven applications, including chatbots, content generation, summarization, and research assistance.

Why Use Claude on Amazon Bedrock?

  • Fully Managed Infrastructure: No need to manage servers or fine-tuning infrastructure—AWS handles deployment and scaling.
  • Customizability: Businesses can fine-tune Claude for domain-specific applications while maintaining Anthropic’s AI safety features.
  • Seamless AWS Integration: Works with AWS Lambda, Amazon S3, DynamoDB, and other AWS services for building AI-powered workflows.
  • Secure & Enterprise-Ready: Built-in compliance and security controls for enterprise applications.
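For teams calling Bedrock directly rather than going through LLMProvider, a request typically goes through the AWS SDK. The sketch below is a minimal boto3 example, assuming AWS credentials and Bedrock model access are already configured; the modelId shown is the standard Bedrock identifier for Claude 3 Haiku, while the anthropic-aws/* names in the table below are LLMProvider's own routing IDs.

```python
import json
import boto3

# Assumes AWS credentials and Bedrock model access are configured for this region.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Summarize this support ticket in one sentence."}],
}

resp = bedrock.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # standard Bedrock ID for Claude 3 Haiku
    body=json.dumps(body),
    contentType="application/json",
    accept="application/json",
)
print(json.loads(resp["body"].read())["content"][0]["text"])
```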

Anthropic Claude Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| anthropic-aws/claude-instant-1.2 | 100K | $1.00 | $3.00 | |
| anthropic-aws/claude-2 | 100K | $1.50 | $4.50 | |
| anthropic-aws/claude-2.0 | 100K | $1.50 | $4.50 | |
| anthropic-aws/claude-3-haiku-20240307 | 100K | $1.50 | $4.50 | |
| anthropic-aws/claude-3-sonnet-20240229 | 100K | $2.00 | $6.00 | |
| anthropic-aws/claude-3-opus-20240229 | 100K | $2.50 | $7.50 | |
| anthropic-aws/claude-3-5-sonnet-20240620 | 100K | $3.00 | $9.00 | |
| anthropic-aws/claude-3-5-sonnet-20241022 | 100K | $3.50 | $10.50 | |

For more details, visit:
🔗 Anthropic on Amazon Bedrock
🔗 Claude Models on AWS

Gizmo OpenAI (Reverse OpenAI Web API)

Gizmo OpenAI is a service focused on reverse-engineering the OpenAI web-based model API. It fully supports All-Tools, Actions, GPTs, and related interfaces, providing efficient and stable API access so developers can integrate and use OpenAI's advanced features.

Gizmo All-Tools Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| gizmo/gizmo-gpt-4 | 100K | $30.00 | $60.00 | |
| gizmo/gizmo-gpt-4o | 100K | $2.50 | $10.00 | |
| gizmo/gizmo-gpt-4o-mini | 100K | $2.50 | $10.00 | |
| gizmo/gizmo-gpt-3.5-turbo | 100K | $2.50 | $10.00 | |
| gizmo/gizmo-o1-mini | 100K | $3.00 | $12.00 | |
| gizmo/gizmo-o1-preview | 100K | $15.00 | $60.00 | |
| gizmo/alk-gpt-4 | 100K | $30.00 | $60.00 | alk: API models exposed through a web interface for browser access |
| gizmo/alk-gpt-4o | 100K | $2.50 | $10.00 | |
| gizmo/alk-claude-3-5-sonnet-20240620 | 100K | $15.00 | $60.00 | |
| gizmo/alk-claude-3-opus-20240229 | 100K | $15.00 | $60.00 | |

Qwen

Qwen models outperform baseline models of similar size on a range of benchmark datasets (e.g., MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP, and BBH) that evaluate natural language understanding, mathematical problem solving, coding, and more. Qwen-72B achieves better performance than LLaMA2-70B on all tasks and outperforms GPT-3.5 on 7 out of 10 tasks.

Qwen Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| qwen/qwen-max | 32K | $3.08 | $9.23 | Flagship model with the strongest reasoning ability |
| qwen/qwen-plus | 131K | $0.12 | $0.31 | Balanced in quality, speed, and cost |
| qwen/qwen-turbo | 1,000K | $0.05 | $0.09 | Fast and very low cost; suited to simple tasks |
| qwen/qwen-long | 10,000K | $0.08 | $0.03 | Balanced quality and speed; cost-efficient for large-scale text analysis |
| qwen/qwen-vl-max | 100K | $3.08 | $9.23 | Vision model; image understanding can be tried online |
| qwen/qwen-vl-max-latest | 100K | $3.08 | $9.23 | Vision model; image understanding can be tried online |

Qwen Transcription and Speech Generation Pricing

| Model | Use Case | Cost | Notes |
|-------|----------|------|-------|
| qwen/paraformer-v2 | Transcription | $0.0048 / minute | |
| qwen/qwen2-audio-instruct | Speech generation | $15.00 / 1M characters | |

Qwen Image Generation Pricing

| Model | Quality | 1024x1024 | Notes |
|-------|---------|-----------|-------|
| qwen/wanx-v1 | | $0.06 / image | |

Qwen Embeddings Pricing

| Model | Price per 1M tokens | Notes |
|-------|---------------------|-------|
| qwen/text-embedding-v1 | $0.10 | |
| qwen/text-embedding-v2 | $0.10 | |
| qwen/text-embedding-v3 | $0.10 | |

For more details, visit:
🔗 Qwen Models
🔗 Qwen Documentation

DeepSeek

DeepSeek is an AI company specializing in large language models (LLMs). It offers models like DeepSeek Chat for general conversations and DeepSeek Coder for code generation, focusing on efficiency, scalability, and cost-effective AI solutions.

DeepSeek Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| deepseek/deepseek-chat | 64K | $0.14 | $0.28 | |
| deepseek/deepseek-coder | 64K | $0.14 | $0.28 | |

MiniMax

MiniMax Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| llmvision/abab6.5s-chat | 64K | $0.14 | $0.14 | |
| llmvision/abab6.5g-chat | 64K | $0.68 | $0.68 | |
| llmvision/abab6.5t-chat | 64K | $0.68 | $0.68 | |
| llmvision/abab5.5s-chat | 64K | $0.68 | $0.68 | |
| llmvision/abab5.5-chat | 64K | $1.44 | $1.44 | |

LLMVision

LLMVision specializes in cutting-edge AI models for text-to-speech (TTS), speech-to-text (STT), and immersive role-playing experiences. Designed for high-performance voice synthesis and recognition, LLMVision enhances interactive AI applications with natural, expressive, and context-aware dialogue capabilities.

LLMVision Text Token Pricing (Per 1M Tokens)

| Model | Max Output Tokens | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
|-------|-------------------|----------------------------|-----------------------------|-------|
| llmvision/SenseCharacter-20240721-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240721-02 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240721-03 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240618-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240619-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240724-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240809-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20240829-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20241231-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20250108-01 | 64K | $0.50 | $1.50 | |
| llmvision/SenseCharacter-20250124-01 | 64K | $0.50 | $1.50 | New |

LLMVision Transcription and Speech Generation Pricing

| Model | Use Case | Cost | Notes |
|-------|----------|------|-------|
| llmvision/lmp-stt-20241013 | Transcription | $0.0048 / minute | |
| llmvision/lmp-tts-20241012 | Speech generation | $15.00 / 1M characters | |

LLMVision Image Generation Pricing

| Model | Quality | 1024x1024 | Notes |
|-------|---------|-----------|-------|
| llmvision/ALLTools-dalle | | $0.06 / image | |

For Providers

If you’re interested in collaborating with LLMProvider, we invite you to visit our providers page to learn more about how to get involved.

Contact Us

If you have any questions, feedback, or would like to discuss potential partnerships, feel free to reach out to us via the following channels:

  • GitHub Repository – For code, issues, and contributions.
  • Discord Channel – Join our community to ask questions, share feedback, or engage with other providers and users.
  • Email Us – For direct inquiries or support.